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Abstract. A network of observable, macroscopic cosmic (super-)strings may 
well have formed in the early Universe. If so, the cusps that generically develop on 
cosmic-string loops emit bursts of gravitational radiation that could be detectable 
by gravitational- wave interferometers, such as the ground-based LIGO/ Virgo 
detectors and the planned, space-based LISA detector. Here we report on two 
versions of a LISA-oriented string-burst search pipeline that we have developed 
and tested within the context of the Mock LISA Data Challenges. The two 
versions rely on the publicly available MultiNest and PyMC software packages, 
respectively. To reduce the effective dimensionality of the search space, our 
implementations use the F-statistic to analytically maximize over the signal's 
amplitude and polarization, A and if), and use the FFT to search quickly over burst 
arrival times tQ. The standard F-statistic is essentially a frequentist statistic that 
maximizes the likelihood; we also demonstrate an approximate, Bayesian version 
of the F-statistic that incorporates realistic priors on A and V- We calculate 
how accurately LISA can expect to measure the physical parameters of string- 
burst sources, and compare to results based on the Fisher-matrix approximation. 
To understand LISA'S angular resolution for string-burst sources, we draw maps 
of the waveform fitting factor [maximized over (A,i/j,tQ)] as a function of sky 
position; these maps dramatically illustrate why (for LISA) inferring the correct 
sky location of the emitting string loop will often be practically impossible. In 
addition, we identify and elucidate several symmetries that are imbedded in this 
search problem, and we derive the distribution of cut-off frequencies / max for 
observable bursts. 



PACS numbers: 04.30.Tv, 04.30.Db, 04.80.Nn, 98.80.Cq, 11.27.+d 

1. Introduction 

There arc several mechanisms by which an observable network of cosmic (super) strings 
could have formed in the early Universe. Basically, string formation arises from the 
breaking of some U(l) symmetry (either global or local) as the Universe expands and 
cools. In the 1980s and 1990s, interest was primarily in cosmic strings arising from 
grand unified theories [T] , but in recent years several string-theory-inspired inflationary 
models have also been shown to populate the Universe with a network of cosmic-scale 
strings [2J[3]. For instance, brane-inflation models can naturally lead to the breaking 
of U(l) symmetries at the end of inflation, leading to the formation of both long 
fundamental strings and D(k + l)-branes that wrap around k compact dimensions 
and extend in one of Nature's three large spatial dimensions. These long strings 
can be stable on cosmological timescales (depending on the exact model) and could 
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reasonably have string tensions in the range 1CP 12 < /z < 10~ 6 . We refer the reader 
to [1] for a nice review of the main physical ideas. 

Simulations have shown that string networks rapidly approach an attractor: the 
distribution of straight strings and loops rapidly becomes independent of its initial 
conditions. The network properties do depend on two basic parameters of the strings, 
the string tension p and the string reconnection probability p. The distribution of loop 
sizes at their birth should in principle be derivable from p and p, but the huge range of 
scales makes this a very difficult problem to solve via simulations, and today the typical 
loop size at birth (as a fraction of the Hubble scale) is still uncertain by many orders 
of magnitude. We refer the reader to Allen [5] for a brief, pedagogical introduction to 
string networks, and to Vilcnkin and Shellard pQ for a more comprehensive review. 

Once formed, string loops oscillate and therefore lose energy and shrink due to 
gravitational-wave (GW) emission. The spectrum of this GW background radiation 
is calculated to be roughly flat over many orders of magnitude in frequency, including 
the frequency bands where current ground-based GW interferometers (like LIGO and 
Virgo) and planned space-based GW interferometers (like LISA) are sensitive. It is 
conventional to express the energy density pew of GWs in terms of 

Pc a In/ 

where p c is the Universe's closure density. The current limit on f2cw(/) from pulsar 
timing is f2cw(/ ~ 2.5 x 10~ 7 Hz) <4x 10~ 8 6 , and the limit from first-generation 
ground-based interferometers is S1gw(/ ~ 100 Hz) < 6.9 x 10 -6 [7]. For comparison, 
the Advanced LIGO detectors should be capable of detecting a stochastic background 
with Ogw(/ ~ 40 Hz) > 10~ 9 7J, while LISA should be capable of detecting a 
string-generated background ft G w(/ ~ 10- 4 -10~ L5 Hz) > 10" 10 [S]. (For LISA, this 
threshold is set not by detector noise, but instead by the background from short-period 
Galactic binaries.) 

In addition to this broadband stochastic background, Damour and Vilenkin [9], [TO] 
pointed out that the kinks and cusps that form on cosmic strings produce short 
GW bursts that could also be detectable for a large range of string parameters p 
and p. Kinks are discontinuities in the string's tangent direction, which arise when 
strings overlap and interconnect, while cusps are points on the string that become 
instantaneously accelerated to the speed of light. The portion of string near the 
cusp beams a burst of linearly polarized GWs in a narrow cone around the cusp's 
direction of motion. Damour and Vilenkin showed that, for current and planned GW 
interferometers, cusp bursts should be significantly more detectable than kink bursts, 
so for the rest of this paper we focus on the former. GW bursts from string cusps have 
a universal shape h(t) oc |t — t c \ 1/3 , or equivalently h(f) = A\f\~ 4/3 e 27Tiftc . (More 
precisely, for observers that are not exactly at the center of the radiation cone, h(f) 
carries a cut-off frequency / max which also smooths out h(t) at t = tc\ see Sec. 2 
below.) 

Searches for cosmic-string bursts in LIGO-Virgo data are already being carried 
out, though to date there have been no detections [TT]. However it is easy to see that 
the planned space-based GW detector LISA should be far more sensitive to string 
bursts than any current or planned ground-based instrument, due to two factors. To 
understand the first, recall that the matched-filtering signal-to- noise ratio (SNR) for 
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any burst is given by 



I 



/ 2 |fe(/)| 2 d(log/) 
fS h (f) 



SNR 2 



(2) 



for any single detector with noise spectral density Sh(f), up to geometrical factors ~ 1. 



Thus, for bursts with \h(f)\ oc /" 4 / 3 , we have (roughly) SNR oc f^ 1/3 /[f b S h (f b )} 1/2 , 



where f b is the frequency where the detector has its best sensitivity. The value of 
f b ~ 1/3 ' /[hShifb)] 1 ' 2 is ~ 10 times higher for LISA than Advanced LIGO, largely due 
to LISA's much lower sensitive frequency band. The second factor arises from the fact, 
discussed in Sec. 2, that a burst's cut-off frequency / max scales as a^ 1 / 3 , where a is 
the angular separation between the beam direction (which is along the instantaneous 
direction of the cusp's motion) and the observer's line of sight. From this, we will show 
in Sec. 2.4 that the rate of bursts arriving at the detector, and satisfying / max > //,, 
scales as f b 2 ^ 3 . Hence, based on a uniform Euclidean distribution of sources, we can 
estimate that the distance to the closest burst that enters a detector's band scales as 
rp n j g j g a ^ gQ a f ac ^ or ^ 20 higher for LISA than Advanced LIGO. So we conclude 
that in any given year, the strongest burst detected by LISA will have an SNR a factor 
~ 100 larger than the strongest burst detected by Advanced LIGO. Clearly, LISA's 
much lower frequency range is a major advantage for string-burst searches. 

While individual bursts are relatively featureless, as Polchinski [3] emphasizes, 
many burst detections would give us an approximate spectrum dN/dp — ap@ (where 
N is the number of detections and p is their SNR), and the two measured parameters 
a and /3 in principle determine the fundamental string parameters p and p, at least 
for networks that are dominated by a single type of string. (However we note that in 
the large region of parameter space for which the strongest observed bursts would be 
much closer than the Hubble distance, the exponent (3 must be very close to —4, and 
so measuring /3 may not be very constraining on the underlying string parameters; see 
Sec. 2.3.) Also, there are large regions of parameter space for which LISA would detect 
both individual string bursts from cusps and the broadband stochastic background 
from loop oscillations [12) . Clearly the measured energy density of the background in 
the LISA band would place one additional constraint on the string model. 

Since the gravitational waveforms from cusps are both very simple and rather 
precisely known, it is natural to search for them using matched filtering. As we explain 
in more detail in Sec. 2, for any set of string parameters, one can easily compute the 
SNR 2 , which is essentially a measure of how well the model waveform (i.e., template) 
matches the data. Then, roughly speaking, finding the best-fit parameters is a matter 
of maximizing the SNR 2 over the six-dimensional source-parameter space. For three 
of the parameters (the signal's amplitude A, polarization ip, and arrival time ic), 
this maximization can be performed almost trivially, using a combination of the F- 
statistic and the FFT. For the remaining three parameters (the two angles giving the 
source's sky position, and the cut-off frequency / max ), we made use of two publicly 
available optimization codes: PyMC [13 , a Python implementation of Markov Chain 
Monte Carlo integration, and MultiNest [TH [T5] , a Fortran 90 implementation of a 
multimodal nested-sampling algorithm [16] . Employing two different optimization 
algorithms allowed us to carry out useful cross-checks. For high-SNR cases, we were 
able to recover Fisher-matrix error estimates, as expected. 

We tested our searches using data sets from the recent third Mock LISA Data 
Challenge (MLDC) [T71IH]- Both our PyMC and MultiNest searches performed well in 
locating the global SNR maxima in parameter space, and our best-fit SNRs were within 
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1% of the true SNRs for all MLDC3 cases. The sources proved difficult to localize 
correctly on the sky, but, as we show in Sec. 3, that was due to near-degeneracies 
intrinsic to the problem, rather than to a failure of our searches. 

Two other reports on LISA string-burst searches, also developed and tested in the 
context of MLDC 3, have appeared recently OHO]. Our work differs p from those 
in several ways: First, we use the F-statistic and FFT to improve search efficiency 
Second, we present an in-depth analysis of waveform overlap (maximized over A, tj), 
and t c ) as a function of sky position. This analysis clarifies why, for most LISA cusp- 
burst detections, the source's sky location is likely to be very poorly constrained by 
the data. Third, we analyze in detail some aspects of the problem that heretofore 
have not been carefully explored, including a suite of nearly exact symmetries (most 
of which were not previously noted) , and the expected distribution of the maximum 
frequency in observed cusp-bursts. 

Other authors have recently focused on other possible kinds of GW signatures 
from cosmic strings: DePies and Hogan [3T] pointed out that for very small string 
tensions (10~ 19 < [i < 10~ n ), GWs might be detected from the oscillations of 
individual nearby strings, thanks to the nearly periodic nature of loop oscillations, 
and to the gravitational clustering of string loops near our Galaxy. Leblond and 
colleagues [35] showed how the breaking of metastable cosmic strings could result in 
detectable GW signals. In this paper, however, we restrict attention to searches for 
cusp-bursts. 

The plan for the rest of this paper is as follows: In Sec. [2] we briefly review the 
general form of a GW burst emitted by a cosmic-string cusp, as well as the associated 
signal registered by LISA. We also review how to maximize SNR cheaply over the 
extrinsic parameters A, ip, and tc, using the ^-statistic and the FFT (both standard 
tricks), and we introduce an approximate Bayesian version of the F-statistic, which is 
only slightly harder to compute than the standard variety. Finally, we digress slightly 
to discuss the expected distribution of / max for observable sources. In Sec.|3]we discuss 
the near-degeneracies in the space of burst signals (and therefore in source parameter 
space), which significantly impact one's ability to infer the true source parameters 
from a measurement: to wit, there is a discrete near-symmetry between sky locations 
that arc reflections of each other across the plane of the LISA detector; in addition, 
a typical signal from a generic sky location can be mimicked to surprising accuracy 
by templates corresponding to a broad swath of very distant points on the sky, if 
the amplitude, polarization and arrival time of the templates are adjusted suitably. 
In Sec. [4] we give brief reviews of the MCMC and nested-sampling search concepts, 
and we describe the particular tunings of these methods that we found to be efficient 
for our GW burst searches. In Sec. [5] we describe the efficacy and accuracy of our 
searches in the MLDC data sets. We summarize our results and conclusions in Sec. [U 
Throughout this paper we use units where G = c = 1; all quantities are expressed in 
units of seconds (to some power). 

2. Theoretical background 

2.1. The gravitational waveform from cosmic-string bursts 

The GWs arriving at the detector from string-cusp bursts are fully characterized by 
six parameters: the source's sky location (given in the MLDCs as the ecliptic latitude 
(3 and longitude A), the burst's overall amplitude (at the detector) A, the polarization 
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a(/)h :;r x ' (4) 



ip, the burst's time of arrival tc, and the upper cut-off frequency /max- 

If we fix the direction k of GW propagation (i.e., we fix j3 and A) and we let ej- 

and efj be a pair of orthogonal polarization basis tensors for waves traveling along k, 
the general burst waveform is expressed most simply in the Fourier domain as 

hij(f) = [^4 + A2e l\ A(/)e 2 " /tc , (3) 
where we adopt the MLDC approximation for A(/), 

/"* f<fn 
y-Sgi-///^ />/„ 

In terms of these variables, .4 and ip are given by 

A = vW + W, ^ = arctan (A 2 /A 1 ) , (5) 
and in order of magnitude, 

^-^T-' /max ^ 2/(a 3 L), (6) 

where is the string tension, L is the characteristic length of the cosmic string, Dl is 
the luminosity distance to the cusp, and a is the angle between the observer and the 
center of the beam, which points along the cusp's instantaneous velocityjf] 

2.2. Maximization over the extrinsic parameters 

The SNR can be maximized analytically over the parameters A and ip using a version 
of the F-statistic, while the FFT provides a highly efficient method to maximize SNR 
over tc- Let us work out the details, beginning with the F-statistic. Consider the 
space of cusp-burst waveforms, and fix the parameters 8 = (/3, \,tc, /max)- We shall 
build a statistic that is equal to the log-likelihood maximized over the vector space of 
all (A 1 , A 2 ). This statistic is a straightforward adaptation of the method employed in 
the (more complicated) cases of circular-orbit binaries [33J and GW pulsars [M[ [25] . 

The LISA science data will consist of the time series of laser-noise-canceling TDI 
observables ([26], and references therein); all the available information about GWs 
can be recovered from a basis of three such observables, such as A, E, and T j27[ [28] 
(these three are especially expedient since they have uncorrelated noises). Thus we 
represent the detector output as the vector s = (s^(f), Se(£), sr(t)J , and we define 
the natural inner product on the vector space of all possible LISA signals (see, e.g., 

ESI), 

( U i v) = 2 r ^(nvApdf + (integrals for E and T) ; (7) 

J-oo Z>A\J) 

where Sa(J) is the single-sided noise spectral density for the observable A (and 
similarly for Ssif) and Sr(/)) ■ Assuming Gaussian noise, the log-probability of 
any noise realization n is then just (— l/2)(n | n), and therefore the log-likelihood of 
the data s given the signal model h is (— 1/2) (s — h | s — h). 

J What Damour and Vilenkin actually show is that <x /~ 4 / 3 for / <C /max, and that 

falls to zero exponentially for / S> /max- Equation |4]| follows the signal model implemented in the 
LIGO Algorithm Library (LAL) to generate burst injections. This model is more precise than Damour 
and Vilenkin's description, though not necessarily very accurate. For consistency, the MLDCs adopted 
the LAL model. 
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Now, both polarization components of the burst produce a linear response in the 
three TDI observables, 



A 1 A(f)e 2 ^ tc e+ -> A 1 (f+ , F+ , F+ ) A(f)e 2 ^ = A^tc), (8) 
A 2 A(f)e 2 ™ ftc e* -> A 2 (f*,F*,F*) A(f)e 2 * lftc = A 2 h 2 (t c ); 



here the F^g T are linear time-delay operators that encode the LISA response to 



plane GWs (see OEOj, as well as the discussion in Sec. 3.1 ). The time delays change 
continuously as the LISA constellation orbits the Sun, but in the limit of short-lived 
GWs, LISA can be considered stationary, and the delays fixed. Thus, the operators 
can be represented as frequency-dependent complex factors F^'^ T (tc, f), which are 
the analogs of antenna patterns for ground-based interferometers. For cosmic-string 
bursts, this approximation is justified by the fact that most of the SNR is accumulated 
over several thousand seconds, to be compared with the one-year timescale of the LISA 
motion. In our searches, however, we always compute the full LISA response in the 
time domain, using Synthetic LISA [26 . 

The best-fit values of A 1 and A 2 are those that minimize 

(s - A^itc) - A 2 h 2 (t c ) | s - A^tc) - A 2 h 2 (t c )) • (9) 

It is easy to show that the optimized A 1 and the log-likelihood log L are given by 

^(r-^h^is), (io) 

logi = -\ [<s I s> - (F-y (h^ c ) | s)(h,(t c ) I s>] + const., (11) 
where the constant in Eq. |TT| ) is just the logarithm of a volume factor, and where 

r ii (t c ) = <h < (t c )|h i (t c )). (12) 

For any given data s, the term (s | s) is also a constant; the remaining piece of logL, 
which depends on h, is known as the F-statistic, and it is given by 

F^^T-y(h t (t c )\ S )(h J (t c )\s). (13) 

In the limit of high SNR, F ss SNR 2 /2, while in the absence of GWs the expectation 
value of F is 1. (It is 2 for GW pulsars, but in that case the F-statistic is maximized 
analytically over twice as many parameters.) 

Using the FFT to maximize SNR over the time of arrival is also a standard 
technique [31] . Here we merely review the implementation details for our case. We 
arrive at the best-fit tc [for a given ((3, A, / max )] by a simple, iterative scheme. We 
make an initial estimate t@ (e.g., by an initial search step in which the source is 

assumed to be at the ecliptic North pole), and we compute h^\f) and h 2 °\f) using 
the time-delay operators evaluated for that time. Next, we calculate the overlap 
integrals (hi(tc) | s) at times tc — t£ + At by taking the inverse Fourier transform, 



c 

(0), 



SaV) 



Approximating as the constant Tij(tjS ), we have 

F(4 0) + At) = -(r- 1 ^))^^ +At) |s)(h,(4 >+Af) | S ). (15) 
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Of course, the advantage of this approach is that we can use the FFT to obtain 
F(i^P + nSt) cheaply for all integers n, where St is the sampling time. We can now 
find the value n = Ub that maximizes F, fit a parabola to the values of F at the points 
n b — 1, rib, an d rib + 1, and locate Atb at the maximum of the parabola. We then 
set 4 1 ' -> t ( c + At b> replace (T' 1 ^))^ by (r" 1 ^))^, and iterate. The reason 
we are iterating is that we need to account for the change in the time-delay operators 
over the time At; in practice, we always find that the original estimate Vq is within 
~ 500 s of the true tc (see Sec. 4.1 ), and that a single iteration determines the best-fit 
t c to ~ 0.01 s. (That is, further iterations change tc by < 0.01 s.) 

This completes our account of the maximization of log-likelihood over the 
parameters (A,ip,tc)- The search over the remaining parameters (/?, A, / max ), is 
discussed in Sec. 14.11 



2.3. Bayesian version of the F-statistic 

As emphasized above, the F-statistic maximizes the log-likelihood over the parameters 
A and ip. However, since we have prior information on their distribution, it makes 
sense to use it to improve their estimation, as well as detection performance. As shown 
by Prix and Krishnan [32] . it is straightforward to construct a Bayesian version of F 
(which we shall call Fg) that incorporates the prior knowledge. The exact form of Fb 
is somewhat unwieldy, but in this paper we show how to construct an approximate 
version that is only slightly harder to compute than the standard F-statistic, and that 
is quite accurate for reasonably high SNR (i.e., for the cases of greatest interest). 

Given the LISA data s, let P(Q, A, ip\s) be the posterior probability of the source 
parameters [with 9 = ((3, X,tc, / rm )]- As per Bayes' theorem, 

P(Q,A,ip\s) oc P (s\Q, A, ip)P(®, A, ip), (16) 

where the first factor on the right is the likelihood of measuring s given the parameters, 
and the second is the prior parameter distribution. Given rotational invariance (no 
preferred source direction, no preferred polarization, and no preferred angle between 
our line of sight and the cusp velocity vector), and given the scaling of / max with 
the observing angle a given in Eq. (JTjj) (which implies that the solid angle a da is 
oc /max 3 4fmax), the prior must have the general form 

P(0, A, i>) dQ dA # = (sin /3 d/3) d\ dt c (/- a 5 x /3 d/max) (17) 
x (w(A)dA)dip, 

where w{A) is a function of A that encodes cosmological information. For simplicity, 
in the rest of this paper we shall set w(A) = A~ A , as appropriate for a uniform 
distribution of strings in Euclidean space (.4 oc r _1 , where r is the distance to 
the source, implies r 2 dr oc A~ A dA). This is a reasonable approximation for light 
strings (/i < 10 -8 ), for which the strongest bursts that LISA observes would occur 
at z < 1. It is straightforward to modify the calculation below to treat any other 
form of w(A). The Bayesian version of the F-statistic corresponds to integrating 
the posterior P(Q,A, ip\a) over A and tp, as opposed to maximizing the likelihood 
for the regular F-statistic. Fixing the data s and the parameters G, let be 
the best-fit waveform with the Ab and ipb that minimize (s — h | s — h) . Defining 
Ah = h(9, A b , ipb) ~ h(6, A, ip) = h b - h, we have 

(s - h | s - h) = (s - h fa + Ah | s - h b + Ah) 
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(18) 
(19) 

here Eq. (181 holds because Ah lies in the (A 1 , A 2 ) vector subspace, to which s — h b 



(s - h b | s - h 6 ) + (Ah | Ah) 
(s | s) - IF + (Ah | Ah) ; 



is orthogonal thanks to the best-fit condition, and Eq. ( 19 ) follows from the very 
definition of F. Thus, the Bayesian Fb is defined by 



e F B (0) = e F(&) 



J e -(A h \A h )/2 A -4 dA 



dtp, 



or equivalently 



F B (G)=F(Q)-log 



e -r V SA*5A^ /2 A -5 dA l dA 2 



(20) 



(21) 



where we have changed variables from (A, tp) to (A 1 , A 2 ), defined (A\,A 2 ) to be the 
best-fit values of the amplitude parameters and 5 A 1 = A 1 — A\, used the definition of 
T 1 - 7 , and transformed volume elements using the standard identity dA 1 dA 2 — AdAdtp. 
We shall now introduce an approximation that is appropriate in the limit of high SNRs, 
for which the exponential e - T ii SAlSA3 / 2 becomes ever more peaked around SA l = 0. 
We therefore expand A~~ 5 around Ab, discarding all terms higher than quadratic: 

5A%(A- 5 )\ Ab + ^8A i SA j didj(A~ 5 )\A b ■ (22) 



A 



-5 



A: 



Note that this approximation effectively regularizes the divergence of P(@, A, tp \ s) 
as A — > 0, which arises from the A~ 4 factor in the integrand. This divergence 
is unphysical anyway; it originates in the assumption of an infinite Euclidean 
universe, and so it is basically another version of Olbers' paradox. If we had used 
a cosmologically sensible prior, such as one based on an FRW universe, there would 
have been no divergence in the first place. 



Because of symmetry, the linear term (and indeed all odd terms) of Eq. ( 22 1 brings 
no contribution to the Gaussian integral. Compared to the zeroth-order term, the 
contribution of the quadratic term is suppressed by O(SNR) -2 , and the contribution 
of the quartic piece by O(SNR) -4 , which justifies neglecting the latter. The remaining 
integral is trivial: defining 



A, 



1 



AldMA^W 



™A- 



(A b )i(A b )j 



-A~ 2 S 



(23) 



we have 



A: 



■ r « M<M, / a [1 + X^diSA^diSA 2 ) = 2 7 r^ 5 (detr)- 1 / 2 [l + A„-(r- 1 ) 1 -''] , (24) 



and therefore 



F B =F-51ogA- ^logdetT 



log [l + X l0 {T- l r] 



(25) 



where we have ignored the constant log7r term, which is irrelevant to searches. 
Aesthetically, the reader may prefer to multiply the integral by a constant scale factor 
s 3 , where s is typical size for A and the A 1 (e.g., 10 -21 ), and then work with rescaled 
ij , and Xij , so that these are all within a few orders of magnitude 



versions of A 
of unity: A z - 



A\ r. 

; A/s, A 



A l /s, Ti 



This leads to an 



equivalent representation of Fb, given by Eq. (251 after replacing all variables with 
their barred version. 

The effect of the "Bayesian correction" terms in Fb is to penalize fits that have 
relatively larger amplitude parameters A 1 . This is precisely what we should expect: 
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Figure 1. Expected distribution djV/ci(log(/ max ) of the maximum burst 
frequency / max for the string bursts detectable by LISA. 



since the amplitudes scale as 1/r, larger A must come from strings that inhabit smaller 
volumes around the detector, which is a priori less likely. Note also that the terms 
involving (or its inverse or determinant) incorporate the effects of the detector 
response, and therefore depend on sky location; for the same A 1 , they penalize sky- 
locations for which the LISA response is relatively poorer. 

Ironically, our Bayesian correction is not quite appropriate for the sources in 
MLDC data sets, which have SNRs drawn from a uniform distribution, so that farther 
sources are not more likely that nearby ones, and sources from sky locations with 
a poor LISA response are equally likely to be detected. Thus, while our Fb (or its 
analog with a better cosmological model) would be optimal for a real search, it does 
not minimize the expected parameter-estimation error for our MLDC entries. 



2.4- Distribution o// max for detected bursts 

As an enlightening application of the distribution of burst parameters given in Eq. 
(17 1, we estimate the distribution of the cut-off frequency / ma x for the cosmic-string 



bursts that LISA would actually detect; i.e., for the bursts whose SNR is above some 
detection threshold pth- We shall see that for most detections / max is in-band and is 
< 50 mHz. Since this subsection is something of a digression from the main flow of 
this paper, we are content with providing a sketch of the derivation. 

The first step is to change variables from A to p, where p is the SNR of the 
observation (the other five parameters remain the same). Clearly p on A. For 
simplicity, we estimate p in the low- frequency approximation to the LISA response |29| . 
In this approximation, the response functions factorize into a frequency-dependent 
term times an angle-dependent term, so we can write 



p = Ar/(f max )K(/3, A, ip) 



(26) 



where n is a known function of the angles (/3, A, ip) whose precise form is irrelevant, 
and 



A 2 (/)ff 
S h (f) 



1/2 



(27) 
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where A(/) was defined in Eq. Q, and Sh{f) [unlike the SA,E,T(f) of Eq. tffh] includes 
the frequency-dependent LISA response. The Jacobian of the transformation is just 
(t/k) -1 . Integrating the prior over all the angles, over the observation time, and over p 
from the detection threshold pth up to oo, we are left with the probability distribution 
of detectable bursts 

dN/df max a /~ a 5 / V (/ max ) . (28) 

In Fig. [I] we plot the function dN/d(\og / max ). To evaluate 77, we used the Sh(f) fit 
given in Eqs. (26)-(31) of |33j . which includes confusion noise from unresolved white- 
dwarf binaries, and for simplicity we approximated A(/) as /~ 4//3 6(/ ma x — f)> with 
©(/max — /) the Heaviside function. As / max increases above ~ 10 mHz, rj remains 
nearly constant, so at these higher frequencies dN/d(log / max ) scales as /max 3 - We 
find that the median value of / max is 12 mHz, and that ~ 2/3 of detected string bursts 
will have / max €E [5, 50] mHz. 

For this calculation we have assumed the "uniform, Euclidean" prior on the 
amplitude, w(A) oc _4~ 4 ; however it should be clear that the qualitative conclusion 
would remain the same even if most detected bursts were at cosmological distances. 
Of course, the results for the case of ground-based detectors like LIGO and Virgo 
would be completely analogous: the median / max for detected string bursts should be 
a factor ~ 2-3 higher than the frequency where Sh(f) is at a minimum. Since for both 
ground-based and space-based GW detectors / max will be in-band for most observed 
bursts, it seems worthwhile to devote more effort to determining the precise shape of 
h(f) around / max (instead of just patching together a power law with an exponential, 
as is currently done). 



3. Near-symmetries and overlap maps 

3.1. Sky-position reflection across the LISA plane 

There is a degeneracy in the LISA response to short-duration, linearly polarized GW 
sources that are located at sky positions related by a reflection across the LISA plane, 
as first noted in [19] . This degeneracy becomes exact in the limit of infinitely short (and 
linearly polarized) GW signals. To understand how this degeneracy arises, we recall 
that the GW response of the laser-noise-canceling TDI observables can be written 
as [25] 

TDI(i) =^c A y (s/r)A (i- A A ), (29) 

A 

where the y s i r {t) denote the one-way phase measurements along the six LISA laser 
links; the sir triplet (a permutation of 123) indexes the laser-sending spacecraft, the 
link, and the receiving spacecraft (see Fig. 3 of [25J); the are time delays (sums 
of the inter-spacecraft times of flight), and ca — ±1. Each phase measurement y s i r 
registers plane GWs according to 

_ ni (t) ■ [h(t s - k ■ Ps (ts)) - h(t - k ■ Pr (t))] ■ h t (t) 

2(1- k-hi{t)) 

To parse this equation, it is useful to think about the effect of GWs on a single laser 
pulse received at spacecraft r at time t: the unit vector k points along the direction 
of GW propagation; h is the GW strain tensor at the solar system barycenter (SSB), 
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which is transverse to k; the p s ,r(t) are the positions of the sending and receiving LISA 
spacecraft; the hi(t) oc p r (t) —p s (t s ) are the photon-propagation unit vectors; and the 
retarded time t s is determined by the light-propagation equation t s = t—\p r (t)—p s (t s )\. 
Thus, the GW strain tensor h is projected onto hi at the events (t,p r (t)) and (t s ,p s (t s )) 
[the reception and emission of the pulse]. For plane GWs, the value of h at those 
events is obtained by giving h the appropriate retarded-time arguments t — k ■ p r (t) 
and t s - k ■ p s (t s ). 

Because the Pi(t) evolve on the LISA orbital timescale of a year, LISA can be 
considered stationary with respect to signals of much shorter duration. In that case, 
the three p s ,n evaluated at the time when the signal impinges on LISA, define a plane 
that contains the six hi. Without loss of generality, let us then express all geometric 
quantities in an (x,y,z) coordinate system where the LISA plane lies along x and 
y. We reflect the source position across the LISA plane by setting k z — > —k z , and 
multiplying h on both sides by diag(l, 1, —1); this has the side-effect of rotating the 
polarization angle ip of the source(§] Because the hi have no z component, all the dot 
products that appear in Eq. p0| ) are unchanged, except for the retarded h times: but 
since the spacecraft positions p r ^ s can be written as a vector in the (x,y) plane plus 
the position vector of the LISA center, R = (pi +P2 +£>3)/3, the overall effect is that 
TDI(i) acquires an additional delay of — 2k ■ R. 

To summarize, a linearly polarized burst from some given direction is almost 
perfectly mimicked, in the LISA data, by a burst whose incidence direction is reflected 
across the LISA plane (as determined at the time when the GWs impinge on LISA), 
and whose polarization and arrival-time at the SSB are suitably rotated and time- 
translated, respectively. This degeneracy is immediately evident as the reflection 
symmetry across the equator in all the plots in Fig. [5j which examines the F-statistic 
structure for the strongest source in the noiseless training data set. Even for the full 
LISA response (without any assumptions of stationarity), the reflection symmetry is 
accurate to better than one part in 10 6 (in FF), which means that SNRs ~ 1,000 
would be required to discriminate between the two sky positions. 

3.2. Broad F -statistic quasi- degeneracy across the sky 

Our searches revealed an additional, approximate degeneracy in the (A, tj), tc)- 
maximized overlap (i.e., the F-statistic) between linearly polarized burst signals 
incoming from an arbitrary sky position, and templates spread in broad patterns 
across the sky. This approximate degeneracy appears even if we use all three noise- 
uncorrelated TDI observables A, E, and T (see e.g. [IS]), and it is worse (i.e., more 
nearly degenerate) for bursts with lower / max . 

While the reflection degeneracy discussed in the last section has a clear 
counterpart in the analytical expression of the LISA response to polarized, plane GW 
waves, this broad degeneracy seems harder to understand analytically. To explore it, 
in Fig. [2] we present a representative set of fitting-factor (FF) sky maps: each map 
corresponds to a target signal with the sky position and polarization indicated by 

§ For a suitable definition of the polarization angle (as given in Appendix A of 26 ), the rotation is 
just tp — > —ip- Now, a generic non-linearly polarized signal can be described by the linear combination 
of two orthogonally polarized signals; the effect of the reflection considered here is then not just an 
overall rotation, but also a relative sign change between the two polarizations. This destroys the 
reflection degeneracy for generic sources, unless yet another source parameter can be adjusted to 
reverse the sign change. 
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the dot (and with unit amplitude and arbitrary arrival time); the contours in each 
map represent the overlap between the target signal and templates across the sky, 
maximized over the amplitude, polarization, and arrival time of the templates. By 
definition, — 1 < FF < 1, but for our signals FF is very close to one across much 
of the sky, so we actually graph — log 10 (l — FF) (e.g., contour "4" corresponds to 
FF = 0.9999). In all maps (and to label each map) we use latitude and longitude 
coordinates defined with respect to the instantaneous LISA plane. To compute the 
FFs, we work with the frequency-domain representation of burst waveforms and of 
the LISA response, modeling the LISA formation as a stationary, equilateral triangle; 
this is the same approximation was used in |28| to compute LISA sensitivity curves. 
(Unequal armlengths will change the FFs somewhat, but our maps are roughly 
consistent with the probability distributions found in our searches, which used a full 
model of the LISA orbits.) 

Looking at Fig. [2j and specifically at the large square multiple plot at the top 
(corresponding to a target source with latitude (3 = 7r/3), we observe a high-FF cell 
around the true position of the target source (the dot), with a mirror cell reflected 
across the LISA plane, at /3' = — f3. The two cells sit on a "circle in the sky" of higher 
FF; unlike the case of two ground-based interferometric detectors, this pattern cannot 
be explained by simple timing considerations, but originates from a more complicated 
matching of geometric elements. One side of the circle crosses the equator with higher 
FF, and indeed our searches often yield broken-circle distributions. In the limit of the 
target source moving to the equator, the two cells coalesce into one; for a target source 
at the pole, the maps exhibit symmetries that oscillate between two- and four-fold as 
a function of polarization. The bottom panel shows that FFs are considerably closer 
to one for bursts with lower-frequency cut offs, although the structure of the maps is 
qualitatively the same. The appearance of double linked circles in some maps is due 
to the fact that the highest displayed FF contour is set at 0.9999 (indexed by "4"); 
single circles would be seen to form at even higher FF. 

We note that Figure [2] presents sky maps for reduced ranges of the target source's 
A and if), which are however representative of the full ranges. Because of a number of 
symmetries, the map for any j3 and A can be obtained by appropriately shifting and 
reflecting one of the maps in the figure. To wit (and as exemplified in Fig. |3J): 

(i) Rotating the source's sky position by 2tt/3 around an axis perpendicular to the 
LISA plane is equivalent to relabeling the three LISA spacecraft (and the TDI 
observables) , so the available geometric information about incoming GW signals 
must remain the same. Therefore map[/3, A + 27r/3, tp] can be obtained by shifting 
map[/3, A, ip] circularly by 2ir/3 along A'. This degeneracy was first mentioned in 

(ii) Furthermore, there is symmetry in the geometric relation between the LISA 

spacecraft and sources on either side of a LISA triangle bisector: v * . With 
the definition of polarization given in [26], this results in map[/3, A, ip] reproducing 
map[/3, 27r/3 — A, —-0], modulo a A' reflection and circular shift by 27r/3. 

(iii) Moving on to polarization, letting %p — > tp + tt / 2 amounts to reversing the sign of 
the polarization tensor, a change that is absorbed by the F -statistic. It follows 
that map[/3, A, xjj + n/2] = map[/3, A, tp]. 

(iv) Last, there is a non-obvious symmetry corresponding to reversing the sign of k 
and tp for both target source and templates (i.e., to considering signals incoming 
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Figure 2. FF maps for high- (top) and low- frequency (bottom) bursts: 
— log 10 (l — FF) contours are computed between (/3,\,ip) target sources (with 
P = 0, tt/3, vr/2, A 6 [0,7r/3], ip G [0,71-/4]) and (/3',A') templates across 
the sky (/?' £ [— 7r/2, 7r/2], A' 6 [— 7T,7t], each small square). Because of the 
symmetries discussed in Sec. |3,2| these A and ip ranges exhaust the variety of 
maps seen across their entire ranges. The target-source latitude /3 = 7r/3 is also 
representative of latitudes intermediate between the equator fi = and the pole 
/3 = 7r/2. At the equator, rjj has no effect on the maps (except for tp = 7r/4, 
where there is no LISA response); at the pole, A is degenerate, and iji is defined 
consistently with the A = meridian. See a zoomable version of this image at 
http : //seadragon . com/view/lmj 
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Figure 3. Symmetries between FF maps, as explained in the main text, 
exemplified for the case of /? = 7r/6, A = 7r/9,^ = 7r/6. 



from the antipodal sky position). Because the burst GWs are invariant w.r.t. time 
inversion about tc, it turns out that the LISA response to (— k, —ip) signals equals 
the time-inverted and time-shifted response to the original (fc, ip) signals (see the 
Appendix). Now, the inner product Q is manifestly invariant w.r.t. the time 
inversion and translation of both u and v ! Thus, this results in map[/3, A + 7r/3, 
reproducing map[/3, A, ip], modulo a circular shift by 7r/3. 

Perhaps the most concise way to characterize the breadth of the degeneracy pattern 
is to plot, for each map, the fraction of the sky with FF below a given level. We do 
this in Fig.[4j where each of the superimposed lines corresponds to a choice of A and ip 
across their entire ranges; the target source latitude is kept fixed to the representative 
value of 7r/3. We can see that for high-frequency bursts (left plot), roughly half of the 
sky has FF > 0.995, and 2% (about 800 square degrees) has FF > 0.9999. The plot 
is even more dramatic for low-frequency bursts, where around 25% has FF > 0.9999. 
The significance of high FFs with respect to the determination of the source's sky 
position is roughly as follows: for the likelihood of any sky position to decrease by 
a factor e, FF must descend below 1 — l/SNR,Q pt , where SNR opt is the optimal SNR 
for a given source. Thus FF > 0.9999 contains the relevant uncertainty region for 
SNR ~ 100. 



3.3. Effects of degeneracies on searches 

The broad quasi-degeneracy pattern is observed clearly in the posterior probability 
plots produced by our MultiNest runs (see Sec. |4.2[ ). Figure [5] was obtained for the 
strongest source (with an SNR ~ 78) in the noiseless^]] MLDC 3.4 training data set. 
In the left-panel sky map, the density of the dots is proportional to the posterior, 

|| In a truly noiseless data set, the source SNR would be infinite, and it would be possible to determine 
its source parameters exactly. Figure[5]is instead produced with the usual statistical characterization 
of noise, for a noise realization that just happens to be identically zero. 
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Figure 4. Fraction of the sky with FF(A + E + T) > 1 — W~ x , for target-source 
ft = 7r/3, and uniformly distributed (A,i/i), where each pair corresponds to one of 
the superimposed curves. The curves were obtained by generating 40 X 40 maps 
as for Fig. [2] assigning a weight to each pixel corresponding to its area in the 
sky, sorting the resulting sequence by increasing FF, and computing normalized 
cumulative weights. 




Figure 5. Posterior-probability structure for the strongest source (#3) in the 
noiseless training data set from MLDC 3.4. Left: in this sky map, the density 
of dots (MultiNest equal-weight "resamples") is proportional to the posterior 
probability, maximized over A., ip and tc, and marginalized over / max . Crosses 
mark the true location of the source, and its LISA-planc-reflected counterpart. 
The map is plotted in the area-preserving Mollwcidc projection, which we adopt 
throughout the rest of this paper. Right: F-statistic as a function of ecliptic 
latitude and longitude, for the same sky locations as in the left panel. Here F is 
offset by a constant ~ 3, 029, and it is only slightly higher for the neighborhoods 
of the true and reflected sky locations than for the arcs connecting them. 



maximized over A, ip and tc, and marginalized over / max . As expected, the dots 
cluster around the true and reflected locations, but they extend around a thick circle 
that cuts through the instantaneous LISA plane at the time of the burst. In the right 
panel, we see that the F-statistic decreases only slightly across the circle. 

Of course, detector noise will somewhat modify the noiseless posterior 
distribution. Figure [6] shows the posteriors computed for the noisy MLDC 3.4 
training data set, and for five more data sets with the same source and different 
noise realizations. Because FFs are consistently high across the circle, it is possible 
for detector noise to displace the best-fit sky location by large angular distances, while 
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Figure 6. Effect of different noise realizations on the posterior-probability 
structure for the strongest source (#3) in the noisy MLDC 3.4 training data set, 
and in five more data sets with the same source and different noise realizations. 
The additional data sets were created using lisatools |34| with the MLDC 3.4 
noise priors, but different pseudorandom-number seeds. 



significantly altering the structure of the circle. 

In three of the plots of Fig. [6] the best-fit point ends up very close to the 
instantaneous LISA plane. Now, sources from those locations elicit a strongly 
suppressed response in the TDI observables, because they come close to being cross- 
polarized with respect to the LISA arms. However, by construction the F-statistic 
will raise the template amplitude correspondingly to achieve a good fit to the signal, 
as shown in the left panel of Fig. [7] for the strongest source (#3) in the (noisy) MLDC 
3.4 training data set. Thus, a "straight" maximum-likelihood search can easily lead 
to a best-fit A that is orders of magnitude larger than its true value. We have dubbed 
this phenomenon a mirage, because it makes sources appear much stronger and closer 
than they truly are. 

It seems that mirages were not noticed by the other research groups who 
participated in the MLDC 3 searches for string-cusp bursts [THIHO]- We conjecture 
that the reason is as follows. While the F -statistic provides the best-fit A and i/j 
for any sky location and / max , the other groups used stochastic algorithms that treat 
all parameters alike. Since the mirage occurs in regions of parameter space that are 
far removed from the true parameters, and in a subspace in which the A and tp 
parameters are rather precisely correlated, it is difficult for these searches to end up in 
these regions. (Given sufficient time, they would arrive there, but if one did not know 
that the mirages existed, one could easily be fooled into thinking that the search had 
converged before it actually had.) 

Such mirages motivated our development of the Bayesian i^g-statistic (Sec. 2.3), 
which penalizes the large-amplitude, nearby-source fits that are a priori very unlikely. 
Best-fit sky locations are correspondingly pushed away from the instantaneous LISA 
plane, as illustrated in the right panel of Fig.[7]for the six signal-cum-noise realizations 
of Fig. [6] Unfortunately, while Fb does tend to disfavor mirage-like fits, it does not 
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Figure 7. Left: the best-fit value for the template amplitude, as computed by 
the F-statistic, increases dramatically for sky positions close to the instantaneous 
LISA plane, as shown here for source #3 in the noisy MLDC 3.4 training data 
set. Right: the Bayesian Fg-statistic shifts the best-fit sky locations away from 
the instantaneous LISA plane, as seen here for the six data sets of Fig. [6] In some 
cases, the best-fit location moves to the other side of the sky; this is not significant, 
given that reflected points have essentially the same posterior probability against 
the same source. 



necessarily lead to best fits that are any closer to the true locations. The broad quasi- 
degeneracy described in Sec. |3.2| implies that good fits exist over much of sky, even 
when Bayesian priors are called into play. 

4. Search methods 

4-1- Markov Chain Monte Carlo 

Markov Chain Monte Carlo (MCMC) methods are used to efficiently integrate (and by 
extension, explore) arbitrary functions / defined over moderate-to-large-dimensional 
spaces with complex or computationally expensive integration measures P [35] , when 
neither analytic techniques nor simple gridding techniques are feasible. MCMC 
methods work by creating a Markov chain of points that are asymptotically distributed 
according to P. Each next point in the chain is chosen by proposing a new candidate 
randomly as a function of the current point, and by choosing either the current point 
or the candidate on the basis of an appropriate criterion that involves their P. For 
any function / with finite expectation value with respect to P and for sufficiently long 
chains, the average value of / on the chain approaches the P-weighted average of / 
on the full space. 

In applications of MCMC methods to Bayesian inference in signal analysis (35] , 
P is typically the posterior probability. In this paper, P is either e F or e Fs , 
evaluated on the 3-dimensional parameter space (/3, A, f max )i or sometimes a subspace. 
Our Metropolis-Hastings MCMC searches were performed using the PyMC software 
package [T3] for the Python programming language. We computed F and Fb as 
described in Sec.[2j using Synthetic LISA [2|)J to obtain the GW polarizations hi^^c)- 
Synthetic LISA was designed to perform highly accurate calculations of LISA's TDI 
responses for any gravitational waveform impinging on LISA (e.g., for burst waveforms 
it does not use the approximation that LISA is stationary over the timescale of the 
burst), but this generality and accuracy come at some cost in speed; we find that each 
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computation of F(tc) or Fsitc) takes 2-3 seconds on a ~ 3 GHz processor. Since 
single MCMC chains cannot be easily parallelized, we typically compute multiple 
chains, with each chain beginning in a different location in the parameter space. 

Given a data set, we find it useful to initially localize the bursts in time, at least 
roughly. To do this, we create a waveform template with arbitrary values for the sky 
position A) and / max , and compute F(t) for all possible times t using the standard 
inverse Fourier transform trick described in Sec. 2.2 The peaks of F(t) correspond 
to the best matches for the template in the data set. In a search on actual LISA 
data, we would need to carefully choose a detection threshold, to separate true GW 
bursts from random noise peaks. However, because MLDC 3.4 was the first challenge 
involving a search for cosmic strings in Mock LISA Data, the SNRs of the injected 
bursts were sufficiently high that the peaks from the bursts could be found in F(t) 
by eye. Because the sky-position for our template was arbitrary, the true values of tc 
(the arrival times of the signal at the SSB, not at LISA) could differ from the times 
^max that maximize F(t) by up to ~ 10 3 s. In practice, we narrowed the search to time 
windows tc G [i m ax — 2000 s, i max + 2000 s], using a longer-than-necessary window for 
additional safety. We use each t max as the starting point for a three-stage search: 

(i) For the first stage, we use the fact that the best-fit value of / max has only very weak 
dependence on the sky position (/3, A), so we choose a random sky position and 
perform a 1-D search over / max . Now, not all the MLDC 3.4 sources have a well- 
defined /max, which is chosen randomly (with uniformly distributed logarithm) 



between 10 3 and 10 Hz. (We noted in Sec. 2.4 that the true prior must scale as 



/max 3 , but rigorous verisimilitude was not a goal of this Challenge.) Thus, /max 
can be above the 0.5 Hz Nyquist frequency of the data set, in which case / ma x 
cannot be determined, other than to say that is > 0.5 Hz. For those signals with 
/max below Nyquist, we find that ~ 1,000 iterations are sufficient to obtain a very 
good estimate. 

(ii) We now fix / ma x to this value, and search over the sky position (j3,X). For this 
second stage, we use eight chains of ~ 1,000 iterations each, starting from different 
sky locations. Because of the reflection symmetry across the LISA plane for burst 
sources (see Sec. 3.1 1, two nearly equal local modes are found at this stage. For 
each mode, the point of highest probability among all chains is chosen as the 
starting point for the third stage of the search. 

(iii) In this final stage, we search over all three (/?, A,/ max ), restricting the MCMC 
proposal distribution to a very narrow Gaussian in order to explore only the 
immediate vicinity of the starting points. We generate one chain for each of the 
two modes, and define our best fit as the highest-probability point of both chains. 

We note that because of the computational limitations discussed above, none of 
our MCMC runs performed enough iterations to enter the regime of convergence. 
Therefore, we regard the chains as searches (maximizations) rather than explorations 
(integrations), and use the maxima attained by the chains as estimates of the true 
mode of the distributions. 



4.2. MultiNest 

MultiNest [T2J [TS] is a publicly available implementation of the nested-sampling 
algorithm for computing the Bayesian evidence of a model given a set of data. Nested 
sampling works by picking a set of N "live" points (typically 1,000) at random from 
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parameter space and then systematically replacing the point with the least P with a 
randomly chosen point^j] of higher P. In this way the set of live points is gradually 
attracted toward the modes of the distribution. As the algorithm proceeds, the number 
of random draws required to find a suitable replacement for the worst point tends to 
increase sharply. In order to alleviate this problem, MultiNest groups live points into 
ellipses, using the k- and cc-means point-clustering algorithms [37]. The ellipses are 
designed to identify and encompass the regions of parameter space that will attract a 
high concentration of live points. The proposed replacements are then drawn randomly 
not from the entire space, but from these ellipses. 

Nested sampling, like MCMC, provides a way to converge efficiently onto the 
(local) modes of a distribution. While this method was designed primarily to calculate 
the Bayesian evidence (an important concern to determine detection confidence for 
weak sources), we find that it also performs well at locating local maxima. Indeed, we 
found it relatively simple to implement a MultiNest-based search for cosmic-string 
bursts. Again, since we use the F-statistic and the FFT trick to maximize the 
likelihood over (A,ip,tc), we define P as e F or e Fs , and search on the remaining 
three parameters (/3, A, / max )- With 1,000 live points, we find that the code converges 
well after approximately 10,000 point replacements, or 10 replacements per live point. 

Since the probability function is identical to that used for our PyMC searches, 
the results from the two methods should be in good agreement. We found that this 
was indeed the case for both the training and challenge data. However, we prefer 
our MultiNest-based search, for several reasons. First, it is easily parallelized. While 
multiple CPUs can be used for multiple chains in MCMC, the long computation time 
for the log-likelihood results in none of our chains reaching the convergent regime in a 
reasonable run time. Although techniques such as parallel tempering and chain mixing 
increase the utility of a multi-chain approach, they require significantly longer chains 
than we were able to achieve given our choice to use exact templates (as computed 
with Synthetic LISA) rather than their static-LISA approximation. By comparison, 
we can easily leverage multiple CPUs for significant speed gains in MultiNest, where 
multiple candidate replacement points can be prepared in parallel, and unexamined 
candidates saved for later use. Second, since our MCMC chains do not reach the 
convergent regime (as discussed in Sec. 4.1), we are more confident in the results 
provided by the MultiNest algorithm, which does converge according to a well-defined 
criterion (a tolerance on the computed evidence). Finally, MultiNest performs well 
even without the somewhat elaborate three-stage procedure we use with PyMC. 



4-3. High-SNR limit and the Fisher-Matrix formalism 

For signals with sufficiently high SNR, the Fisher-matrix formalism provides a useful 
test of how accurately our codes are calculating the posterior probability. Consider a 
single burst immersed in noise, and imagine dialing up the burst's amplitude. As the 
SNR increases, the contour of constant likelihood that encloses a given fraction of the 
total probability (say, 68% for the 1-er contour) shrinks to encompass an ever smaller 
region of parameter space. (Actually, because of the discrete symmetry described in 
Sec. [3j in our case two disjoint contours shrink onto two distinct regions: one region 

1 This random choice must take into account the prior distributions of the parameters. Indeed, 
MultiNest requires that the n-dimensional parameter space first be mapped into the n-dimcnsional 
unit hypercube, from which MultiNest draws samples assuming a uniform distribution. Any non- 
uniform priors must be taken into account in this mapping. 



Searches for Cosmic- String Gravitational- Wave Bursts in Mock LISA Data 



20 



that is close to the true parameter values, and another that is related to it by reflection 
across the LISA plane.) The smaller the region, the better the log-likelihood function 
within the contour is described by a constant (the maximum value) plus the second 
partial derivative term (the Hessian) in a Taylor expansion. The matrix of partial 
second derivatives of the log-likelihood is given by 

- \d*d»{* - h | s - h> = (d„d„h | s - h) - r M „ , (31) 
where is the Fisher matrix |38j . defined by 



Here h(x") is the waveform (a function of all the parameters x"), (• ■ • | • • •) is the 
inner product defined in Eq. Q , and the partial derivatives are evaluated at the local 
maximum. [In a slight abuse of notation, we are using Greek indices to distinguish 
the Gamma matrix on the full parameter space from its restriction to the two- 



dimensional subspace (A 1 , A 2 ), which we defined as in Sec. 2.2 ] In the high-SNR 



limit, the posterior distribution function near a local mode approaches a Gaussian, and 
the second term on the right-hand side of Eq. (31) dominates, so by integration of a 
Gaussian exponential the covariance matrix of the parameters (restricted to parameter 
values near the given mode) approaches the inverse of the Fisher matrix. To wit: let 
be the local best-fit parameter values, let Ax" = x" — x%, and let Ax" Ax" be 
the posterior-weighted average of Ax" Ax" (where the averaging is restricted to a 
neighborhood of the given mode); then 

Ax" Ax" -> (r- 1 )^ as SNR^oo. (33) 

Thus, an especially simple test of the posterior distribution generated by our MultiNest 
runs is just to check that, for high SNR, the "variance factor" Ax" Aa^/fl 1-1 ) 
approaches one for all fj,. As our test case, we choose the strong source (#3) from 
the MLDC 3.4 noiseless training data set. As shown in the top plot in Fig. [5J near 
both modes the posterior distribution is more "banana-shaped" than ellipsoidal, so we 
would not expect the Fisher-matrix approximation to be very accurate. The bottom 
six plots in Fig. [8] show the posterior distribution for each parameter separately, and 
compare these with Gaussian distributions based on the inverse Fisher matrix. We 
see that in this case, for which the SNR is m 78, the marginalized posteriors do not 
have Gaussian shapes, and the Fisher matrix provides only a rough estimate of the 
actual variances; the variance factor ranges between 0.6 and 8.8. In Fig. [9] we show 
the posterior distribution for the same source, with an increased SNR s» 1,000. The 
agreement is much better. 

We regard Fig. [9] as additional confirmation that our search codes are working 
as expected. By contrast, we regard Fig. [S] as a warning that for LISA detections of 
string-bursts, even at SNR ~ 80, the Fisher-matrix approximation cannot be relied 
on to predict parameter-estimation errors accurately. 



5. Results from the Mock LISA Data Challenges 

The purpose of the MLDCs is stimulate the development and evaluate the performance 
of LISA data-analysis tools and methods. In each challenge, data sets containing 
simulated noise plus GW signals of undisclosed source parameters are made publicly 
available and all interested research groups are invited to test their algorithms on these 
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Figure 8. Comparison of MultiNest posterior distributions with Fisher-matrix 
estimates, in the case of the strongest source (#3) of the MLDC 3.4 noiseless 
training data. The top plot shows that the posterior distribution on the sky 
is more "banana-shaped" than ellipsoidal. The next six plots compare the true 
posterior distribution (restricted to the neighborhood of the "true" mode) with 

Gaussian distributions of variance = (T ) . The variance factor, defined 
as ""flt/^Fishor' ran S es between 0.6 and 8.8. 



blind challenge data. Each challenge includes also training data sets with published 
source parameters, to help groups develop and calibrate their codes. The MLDCs are 
becoming more realistic with each new challenge, encompassing a larger number and 
variety of sources. 

The third MLDC was the first to include a search for bursts from cosmic strings, 
MLDC 3.4. This data set consisted of 2 21 samples with a cadence of 1 s (for a total 
of ~ 1 month), and it included a few randomly chosen string-burst signals injected 
into purely instrumental noise (i.e., the data set did not include signals from other 
types of sources, or the confusion noise from unresolvable Galactic binaries). The sky 
positions of the injected sources were chosen randomly from a uniform sky distribution; 
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Figure 9. Same as Fig. [8] except that the source's SNR is now 1,000. In this 
case, the posterior is fit very well by the Fisher-matrix prediction. Even at this 
high SNR, a secondary maximum is present around the reflected location, but it 
is not shown in this plot. 



the polarizations 4> were drawn uniformly from [0, 7r]; and / max was drawn [10 -3 , 10] 
Hz with a uniformly distributed logarithm. 

MLDC 3.4 called for a random number (a Poisson deviate of mean 5) of injected 
bursts, with SNRs drawn uniformly from [10, 100]. As discussed in Sec. 2.3 these priors 
for / max and SNR are not astrophysically realistic, but the intent for this challenge 
was less to maintain astrophysical realism than to test search algorithms for a wide 
range of source parameters (i.e, a wider range than one would obtain from a handful 
of detections with realistic parameters). As it turned out, the MLDC 3.4 data set 
contained exactly three string bursts, all with SNRs in the range 36-45. Of course, 
the realistic expectation is that most detections will have SNRs within 50% of the 
detection threshold, which is likely to be ~ 6. Thus, all the MLDC 3.4 bursts had 
SNRs a factor 4-5 higher than will be typical. 
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In this challenge, the exact spectral densities of instrumental noise were 
randomized and undisclosed, but they were guaranteed to lie within fairly narrow 
ranges. In our searches, we ignored this feature, to little apparent damage, by taking 
the TDI observables to have the standard MLDC noise spectral densities as assumed 
in the other MLDC challenges. Explicit expressions for these SU(/), SeH) and Sx(f) 
are given in |28) . 

In our entries to MLDC 3.4 and in this paperj^Jwe report the best-fit parameters 
found by our searches (i.e., the maxima of F or Fb). In fact, because there are 
always two parameter sets that fit the data almost equally well, due to the reflection 
symmetry described in Sec. |3.1| for each burst we report the best-fit parameters of both 
modes. Table [T] lists the true and best-fit parameters, and Table [2] the corresponding 
estimation errors; Figure [l0| shows sky plots of the posterior distributions derived from 
our MultiNest searches. 

Certain aspects of the results presented in Table 1 and Figure 10 require 
clarification. For Source #0, the true sky location is ruled out by parameter 
estimation. This should not be surprising: in the high-SNR regime, the variance 
of SNR 2 over the ensemble of noise realizations is of the order of the number of source 
parameters; thus the likelihood at the best-fit parameters can exceed the likelihood 
at the true parameters by large exponential factors. For Source #1, we find that the 
maximum of Fb lies outside the two regions of the sky where the posterior probability 
is concentrated. In the Table we report instead on the maxima that lie within the 
large, high-probability clusters. The outlying maximum lies close to the LISA plane, 
and so it resembles the mirages discussed in Sec. |3.3| In this case, however, the 
best-fit amplitude is only a factor of two higher than the true value, so the Bayesian 
correction term implicit in Fb does not strongly disfavor it. For Source #0, MultiNest 
converged to values of / max above the Nyquist frequency, although one of the MCMC 
chains managed to lock onto a better value. 

In summary, we find that both the PyMC and MultiNest searches perform well 
at locating the peaks of the posterior, and that the best fits found by the two methods 
are mostly consistent. In this sense, both techniques are successful. However, because 



of the broad degeneracy of the posterior across the sky (described in Sec. 3.2 ), we find 
that instrument noise will generally shift the best-fit parameters rather far from their 
true values. Because the LISA response introduces strong correlations between sky 
position and the parameters (A, ip, t c ), these come to have large errors as well. Thus, 
we should not hope for accurate sky locations in LISA detections of string bursts with 
SNR ~ 40, and the situation will only be worse for typical LISA detections with SNR 

We emphasize that we believe that these large parameter-estimation errors are 
not a result of bugs or lack of convergence in our search methods, but are simply the 
consequence of the broad parameter-space degeneracy of cusp-burst signals. Besides 
the consistency between our PyMC and MultiNest results, we performed an additional 
test by verifying that parameter-estimation accuracy improves when we boost the SNR 
to ~ 1,000, as shown in Table [3] for source #3 in the noisy MLDC 3.4 training data 

The values shown in this paper are somewhat different from the values wc submitted for MLDC 
3.4, which can be viewed at www.tapir.caltech.edu/mldc Our algorithms have improved since the 
conclusion of MLDC 3, and to keep this paper current with our research effort, here we have chosen 
to report our newer results. In some cases, our newer best-fit parameters are actually further from 
the true parameter values than our original entries. Nevertheless, the values reported here arise from 
a more correct analysis of the data. 
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Table 1. True source parameter values and MCMC and MultiNcst best fits for 
the MLDC 3.4 challenge data set. When the estimated /max is larger than the 
0.5 Hz Nyquist frequency. 



parameter I true value I MCMC #1 MCMC #2 MN #1 MN #2 







Source 






fj [I dj(_lj 




0.551 


0.119 


n ^a^ 


u.yoo 


A [rad] 


3.711 


5.843 


0.005 


5.858 


5.295 


/max [Hz] 


0.030 


> 0.5 


0.044 


> 0.5 


> 0.5 


tp [rad] 


3.319 


2.936 


2.776 


2.926 


1.914 


A [1(T 21 ] 


0.86636 


3.0368 


1.1394 


2.903 


3.142 


t c [10 6 s] 


1.60216 


1.60288 


1.60305 


1.60289 


1.60265 


SNR 


44.610 


44.985 


44.842 


44.987 


44.993 






Source 1 








fl AAA 


-0.753 


0.256 


u.uoo 


n 991 


A [rad] 


3.167 


0.015 


3.486 


0.076 


3.502 


/max [Hz] 


0.0010842 


0.0010927 


0.0010932 


0.001087 


0.001085 


ip [rad] 


5.116 


4.233 


5.023 


4.275 


5.019 


A [W- 21 ] 


2.7936 


1.6528 


1.6585 


1.621 


1.688 


t c [10 6 s] 


1.07269 


1.07349 


1.07266 


1.07352 


1.07265 


SNR 


36.691 


36.704 


36.702 


36.703 


36.704 






Source 2 






P [rad] 


-0.800 


0.179 


1.154 


0.141 


1.176 


A [rad] 


0.217 


0.271 


2.746 


0.259 


2.876 


/max [Hz] 


6.1495 


0.030 


0.025 


0.026 


0.030 


■0 [rad] 


4.661 


4.631 


5.225 


4.630 


5.129 


.4 [io- 21 ] 


0.85403 


1.0319 


1.0285 


1.007 


1.016 


t c [10 6 s] 


0.60001 


0.60015 


0.59949 


0.60015 


0.59949 


SNR 


41.378 


41.497 


41.496 


41.495 


41.496 



set. For such high SNR, the MultiNest best-fit parameters are reassuringly close to 
the true values. 

6. Summary, conclusions and future work 

In this paper we have reported on our work to develop two string-burst search pipelines, 
which rely on the ^-statistic and the FFT to efficiently maximize the likelihood over 
(.4, ip) and tc, respectively, and which are based on the publicly available PyMC and 
MultiNest libraries to maximize over the remaining parameters (l3,X,f max ). Both 
of our pipelines proved reasonably efficient (MultiNest more so, due to greater gains 
from parallelization) . We tested our searches by checking that they yielded mutually 
consistent best fits, and that posteriors results agreed with Fisher-matrix estimates for 
sufficiently large SNR. Given the relative simplicity of string-burst signals, we expected 
that off-the-shelf optimization codes like PyMC and MultiNest be would sufficiently 
powerful for this search, which our work has verified. 

Although the few string-burst injections in MLDC 3.4 had all SNR ~ 40, it did 
not prove possible to localize them on the sky to better than <~ one radian. We 
showed that this result is just what should be expected, on the basis of the broad 
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Table 2. Differences between true source parameter values and MCMC and 
MultiNest best fits, for the MLDC 3.4 challenge data set. The Asky error is 
measured in radians along the geodesic arc between the true and best-fit sky 
positions. 



parameter 


MCMC #1 


MCMC #2 


MN #1 


MN #2 








Source 






Asky 


[rad] 


1.680 


2.278 


1.695 


1.140 


A log 10 /max 




> 1.222 


0.169 


> 1.222 


> 1.222 


Aip 


[rad] 


0.383 


0.543 


0.394 


1.405 


A log .4 




1.254 


0.274 


1.209 


1.288 


At c 


M 


716.38 


881.18 


722.40 


485.39 


ASNR 




0.375 


0.232 


0.378 


0.383 








Source 1 






Asky 


[rad] 




U. / DD 


z.Uoy 


O 7 AO 


A l0g 1Q /max 




3.37 x 10~ 3 


3.59 x 10~ 3 


1.270 x 10 -3 


4.083 x 10~ 4 


Aip 


[rad] 


0.884 


0.093 


0.842 


9.758 x 10~ 2 


A log .4 




0.525 


0.521 


0.544 


0.504 


At c 


M 


794.28 


41.06 


828.39 


43.95 


ASNR 




0.014 


0.011 


0.012 


0.013 








Source 2 






Asky 


[rad] 


0.980 


2.662 


0.942 


2.690 


A l0g 1Q /max 




2.316 


2.396 


2.377 


2.318 


Aip 


[rad] 


0.030 


-0.564 


0.031 


0.467 


Alog„4 




0.189 


0.186 


0.165 


0.174 


At c 


M 


141.40 


519.79 


145.06 


522.02 


ASNR 




0.119 


0.118 


0.117 


0.118 



Table 3. Parameter accuracy achieved by MultiNest for source #3 in the MLDC 
3.4 training data set, with the original and boosted SNR. 



parameter 


true value 


boosted 


best fit 


best fit (boosted) 


/3 [rad] 


0.239 




-0.036 


0.247 


A [rad] 


1.090 




1.204 


1.092 


/max [Hz] 


1.152 x 10~ 2 




1.161 x 10~ 2 


1.151 x 10~ 2 


ip [rad] 


0.399 




0.571 


0.394 


A [10- 21 ] 


2.647 


37.26 


2.204 


37.26 


t c [10 6 s] 


2.060273 




2.060245 


2.060272 


SNR 


78.122 


1082.9278 


78.137 


1082.9291 



parameter 



error error (boosted) 

8.617 x 10~ 3 

1.2 x 10 -4 
4.9 x 10~ 3 
0.0125 
1.45 

1.3 x 10~ 3 



Asky 

A l0g 1() /„ 

Aip 

A log .4 

At c 
ASNR 



[rad] 
[rad] 
[s] 



0.297 
3.6 x 10" 
0.171 
0.183 
27.89 
0.015 
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Figure 10. MultiNest sky-location posteriors for sources 0-2 in the MLDC 
3.4 challenge data set. The density of the dots is proportional to the posterior 
probability (including the Fg prior correction described in Sec. |2.3| , maximized 
over tc, and marginalized over A., tp, and /max- Crosses and circles indicate 
the true and best-fit locations, respectively. For source 1, the stars indicate the 
location of mirage best fits discarded by Fb- 



degeneracy illustrated by the fitting-factor maps of Sec. 3.2 Determinations of A 



and ip are correspondingly poor — again to be expected, since these parameters are 
strongly correlated with the sky location in the signal measured by LISA. While so far 
we have analyzed only a handful of bursts in detail, there is every reason to presume 
that poor parameter-estimation accuracy will be a robust feature of LISA string- 
burst detections. In future work, we intend to verify or disprove this presumption 
by analyzing a much larger sample of bursts drawn from an astrophysically sensible 
distribution. This paper also included: 

(i) the proof of the near-degeneracy between (linearly polarized) burst signals from 
directions that are reflections of each other across the LISA plane (which had 
been noted elsewhere, but heretofore not explained analytically); 

(ii) the first detailed look at string-burst fitting factors as a function of sky position, 
revealing very high FF over a large fraction of the sky; 

(iii) the analysis of four discrete symmetries (three of which not previously discussed) 
between different fitting function maps; 

(iv) the derivation of an approximate, easily computed Bayesian version of the F- 
statistic, based on realistic priors; 
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(v) a calculation of the expected distribution of / max for detected bursts. 

We envisage two broad directions for future work. First, so far we have 
concentrated on finding the physical parameters of a single string-burst. Using these 
sorts of results as input, the next step will be to determine how well LISA can answer 
questions about the string network (e.g., are there different types of strings? What 
are /j, and p for each class?) based on an observed population of string-bursts, plus any 
information from a cosmic-string stochastic background. Second, so far our searches 
have been designed for single bursts in Gaussian noise of known spectral density. We 
need to generalize our methods to the cases where the noise level and shape are not 
precisely known (and so must be determined from the data), and where the burst 
signals are superimposed on a realistic LISA data set containing confusion noise from 
millions of individually unresolvable sources (mostly white-dwarf binaries) plus tens 
of thousands of resolvable signals from a variety of sources (especially white-dwarf 
binaries, EMRIs, and merging massive black binaries). 
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Appendix A. Proof of the fourth FF-map symmetry 

A simple way to see this is to consider a one "arm" or a simple-Michelson TDI response 
(this entails no loss of generality, since Michelson TDI variables are a basis for all 
possible observables |28) . and the derivation would proceed very similarly for first- 
and second-generation TDI Michelson variables). For instance, using the notation of 
and of Eq. (|30]), consider 

armi 2 (fc,?M) = 2/231 (t) + 2/i3'2(* - L) 

1 n 3 ■ [h(t - k ■ pi) - h(t - L - k ■ p 2 )} ■ n 3 



2 1 - k ■ n 3 

1 n 3 * ■ [h(t — L — k ■ P2) — h(t — 2L — k ■ pi)] ■ n 3 > 



(A.l) 



2 1 - k ■ n 3 / 

Now n 3 = — ri3', and the dot product of n 3 ® ^3 with the polarization tensor for a 
linearly polarized plane GW with (fc, ip) and (— k, —ip) can be seen to be the same using 
the formulas of Appendix A]. Let us then drop those products, and concentrate 
on the time arguments of the h, as well as the geometric projection factors 1 — k ■ nj. 
Now we let k — ¥ —k, exchange 71-3 with — ny in the denominator, and time-advance 
the whole expression by 2L: 

h(t + 2L + k- Pl )-h(t + L + k-p 2 ) h(t + L + k ■ p 2 ) - h(t + k ■ Pl ) 

1 1 ' 1 ; ' ( A -^j 

1 — k ■ fly 1 — k ■ n 3 
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after time-inverting the argument of the h (without loss of generality, let tc = 0), we 
can match the terms one by one with the original expression, yielding, Q.E.D., 

armi2(— k, —ip; t + 2L) = — armi2(fc, ip; —t). (A. 3) 
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